An Improved Policy Iteration Algorithm

Author

  • A. Hansen
Abstract

A new policy iteration algorithm for partially observable Markov decision processes is presented that is simpler and more efficient than an earlier policy iteration algorithm of Sondik (1971, 1978). The key simplification is the representation of a policy as a finite-state controller, which makes policy evaluation straightforward. The paper's contribution is to show that the dynamic-programming update used in the policy improvement step can be interpreted as the transformation of a finite-state controller into an improved finite-state controller. The new algorithm consistently outperforms value iteration as an approach to solving infinite-horizon problems.
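To illustrate why a finite-state controller makes policy evaluation straightforward, here is a minimal sketch in Python. It is not the paper's code: the function name, the argument layout (`actions`, `successor`, `T`, `O`, `R`), and the use of successive approximation instead of a direct linear solve are all assumptions made for illustration. Each controller node `n` is assumed to fix one action and one successor node per observation, so the value of starting in node `n` and state `s` satisfies a linear fixed-point equation.

```python
def evaluate_controller(actions, successor, T, O, R, gamma,
                        tol=1e-10, max_iter=100_000):
    """Evaluate a finite-state controller on a POMDP (illustrative sketch).

    All argument names are hypothetical:
      actions[n]       action chosen in controller node n
      successor[n][o]  next controller node after observation o
      T[s][a][s2]      P(s2 | s, a)
      O[s2][a][o]      P(o | s2, a)
      R[s][a]          immediate reward
    Returns V where V[n][s] is the expected discounted value of
    starting in node n while the hidden state is s.
    """
    n_nodes = len(actions)
    n_states = len(T)
    n_obs = len(successor[0])
    V = [[0.0] * n_states for _ in range(n_nodes)]
    for _ in range(max_iter):
        delta = 0.0
        new_V = [[0.0] * n_states for _ in range(n_nodes)]
        for n in range(n_nodes):
            a = actions[n]
            for s in range(n_states):
                # Expected value of the successor (node, state) pair.
                future = 0.0
                for s2 in range(n_states):
                    for o in range(n_obs):
                        future += (T[s][a][s2] * O[s2][a][o]
                                   * V[successor[n][o]][s2])
                new_V[n][s] = R[s][a] + gamma * future
                delta = max(delta, abs(new_V[n][s] - V[n][s]))
        V = new_V
        if delta < tol:  # stop once the largest update is negligible
            break
    return V
```

Under these assumptions, the value of node `n` at a belief `b` is the dot product `sum(b[s] * V[n][s] for s in range(len(b)))`. For a discount factor below 1 the iteration converges to the unique solution of the linear system, which could equally be obtained with a direct solver.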

Similar resources

An Improved Imperialist Competitive Algorithm based on a new assimilation strategy

Meta-heuristic algorithms inspired by natural processes, such as the genetic algorithm, particle swarm optimization, ant colony optimization, and the firefly algorithm, are a class of optimization algorithms that have received attention in recent years. Recently, a new kind of evolutionary algorithm has been proposed that is inspired by the human sociopolitical evolution process. This new algorith...

An Improved Algorithm for Network Reliability Evaluation

A Binary Decision Diagram (BDD) is a data structure proven to be compact in representation and efficient in the manipulation of Boolean formulas. The use of binary decision diagrams in network reliability analysis has already been investigated by some researchers. In this paper we show how an exact algorithm for network reliability can be improved and implemented efficiently by using CUDD - Colorado Univer...

Optimization of Thermal Instability Resistance of FG Flat Structures using an Improved Multi-objective Harmony Search Algorithm

This paper presents a clear monograph on the optimization of thermal instability resistance of FG (functionally graded) flat structures. To this end, two FG flat structures, namely an FG beam and an FG circular plate, are considered. These structures are assumed to obey the first-order shear deformation theory, a three-parameter power-law distribution of the constituents, and clamped bounda...

Bat Algorithm for Optimal Service Parameters in an Impatient Customer N-Policy Vacation Queue

In this paper, a meta-heuristic method, the Bat Algorithm, based on the echolocation behavior of bats, is used to determine the optimum service rate of a queueing problem. A finite-buffer M/M/1 queue with N-policy, multiple working vacations, and Bernoulli schedule vacation interruption is considered. Under two forms of customer impatience, balking and reneging, the...

Tuning of fuzzy logic controller using an improved black hole algorithm for maximizing power capture of ocean wave energy converters

Seas and oceans are the most important sources of renewable energy in the world. The main purpose of this paper is to use an appropriate control strategy to improve the performance of point absorbers. In this scheme, considering the high uncertainty in the parameters of the power take-off system in different atmospheric conditions, a new improved black hole algorithm is introduced to tune fuzzy...

Publication date: 1997